Overview

Dataset statistics

Number of variables: 15
Number of observations: 279856
Missing cells: 18272
Missing cells (%): 0.4%
Duplicate rows: 83388
Duplicate rows (%): 29.8%
Total size in memory: 32.0 MiB
Average record size in memory: 120.0 B
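
The percentages and memory figure above follow from the raw counts. A minimal sketch of that arithmetic, using only numbers quoted in this report (the helper name `pct` is illustrative and not part of ydata-profiling):

```python
# Raw counts copied from the dataset statistics in this report.
n_variables = 15
n_observations = 279856
missing_cells = 18272
duplicate_rows = 83388
record_bytes = 120.0  # average record size in memory, in bytes


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Missing cells are measured against all cells (rows x columns).
missing_pct = pct(missing_cells, n_variables * n_observations)
# Duplicate rows are measured against the number of rows.
duplicate_pct = pct(duplicate_rows, n_observations)
# Total memory is the average record size times the row count, in MiB.
total_mib = record_bytes * n_observations / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Total size: {total_mib:.1f} MiB")
```

Note the different denominators: the missing-cell share is relative to all 15 × 279856 cells, which is why 18272 missing cells is only 0.4% here but 6.5% of the Occupation column.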

Variable types

Numeric: 9
Categorical: 5
Boolean: 1

Alerts

Dataset has 83388 (29.8%) duplicate rows | Duplicates
Age is highly overall correlated with Income | High correlation
City is highly overall correlated with State | High correlation
Credit Score is highly overall correlated with Existing Customer and 3 other fields | High correlation
Employment Profile is highly overall correlated with Occupation | High correlation
Existing Customer is highly overall correlated with Credit Score and 3 other fields | High correlation
Income is highly overall correlated with Age | High correlation
LTV Ratio is highly overall correlated with Profile Score | High correlation
Loan Tenure is highly overall correlated with Credit Score and 3 other fields | High correlation
Number of Existing Loans is highly overall correlated with Credit Score and 3 other fields | High correlation
Occupation is highly overall correlated with Employment Profile | High correlation
Profile Score is highly overall correlated with Credit Score and 4 other fields | High correlation
State is highly overall correlated with City | High correlation
Occupation has 18272 (6.5%) missing values | Missing
Number of Existing Loans has 26590 (9.5%) zeros | Zeros

Reproduction

Analysis started: 2024-08-04 08:49:54.000181
Analysis finished: 2024-08-04 08:50:25.060324
Duration: 31.06 seconds
Software version: ydata-profiling v4.9.0
Download configuration: config.json
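
The reported duration can be recovered from the two timestamps. A small sketch using only the standard library (the function name `duration_seconds` is illustrative):

```python
from datetime import datetime

# Timestamp layout used in the Reproduction block of this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def duration_seconds(started: str, finished: str) -> float:
    """Return the elapsed time between two timestamps, in seconds."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


elapsed = duration_seconds(
    "2024-08-04 08:49:54.000181",
    "2024-08-04 08:50:25.060324",
)
print(f"Duration: {elapsed:.2f} seconds")
```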

Variables

Age
Real number (ℝ)

HIGH CORRELATION 

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.005217
Minimum: 18
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 18
5-th percentile: 20
Q1: 31
Median: 44
Q3: 57
95-th percentile: 68
Maximum: 70
Range: 52
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 15.311051
Coefficient of variation (CV): 0.34793718
Kurtosis: -1.204329
Mean: 44.005217
Median Absolute Deviation (MAD): 13
Skewness: -0.0024112255
Sum: 12315124
Variance: 234.42828
Monotonicity: Not monotonic
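
Several of the Age statistics are derived from the others: the CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the range and IQR are differences of quantiles. A minimal sketch that recomputes them from the reported values (the function `derived_stats` is illustrative, and the results match the report only up to rounding of the inputs):

```python
# Primary values copied from the Age statistics in this report.
n_observations = 279856
mean, std = 44.005217, 15.311051
minimum, q1, q3, maximum = 18, 31, 57, 70


def derived_stats(mean, std, minimum, q1, q3, maximum, n):
    """Recompute the derived figures from the primary ones."""
    return {
        "cv": std / mean,            # coefficient of variation
        "variance": std ** 2,        # squared standard deviation
        "range": maximum - minimum,  # max minus min
        "iqr": q3 - q1,              # interquartile range
        "sum": mean * n,             # mean times number of observations
    }


age = derived_stats(mean, std, minimum, q1, q3, maximum, n_observations)
for name, value in age.items():
    print(f"{name}: {value:.8g}")
```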
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
55 | 5540 | 2.0%
23 | 5502 | 2.0%
53 | 5472 | 2.0%
36 | 5459 | 2.0%
41 | 5451 | 1.9%
40 | 5447 | 1.9%
52 | 5441 | 1.9%
56 | 5403 | 1.9%
70 | 5389 | 1.9%
24 | 5383 | 1.9%
Other values (43) | 225369 | 80.5%

Minimum 10 values

Value | Count | Frequency (%)
18 | 5311 | 1.9%
19 | 5244 | 1.9%
20 | 5226 | 1.9%
21 | 5361 | 1.9%
22 | 5228 | 1.9%
23 | 5502 | 2.0%
24 | 5383 | 1.9%
25 | 5140 | 1.8%
26 | 5354 | 1.9%
27 | 5370 | 1.9%

Maximum 10 values

Value | Count | Frequency (%)
70 | 5389 | 1.9%
69 | 5307 | 1.9%
68 | 5146 | 1.8%
67 | 5311 | 1.9%
66 | 5185 | 1.9%
65 | 5261 | 1.9%
64 | 5298 | 1.9%
63 | 5340 | 1.9%
62 | 5201 | 1.9%
61 | 5169 | 1.8%
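
The "Other values (43)" row for Age is the remainder after the ten most frequent values are subtracted from the totals. A short sketch of that bookkeeping, using the counts listed above (variable names are illustrative):

```python
# Totals and the ten most frequent Age values, copied from this report.
n_observations = 279856
n_distinct = 53
top10 = {
    55: 5540, 23: 5502, 53: 5472, 36: 5459, 41: 5451,
    40: 5447, 52: 5441, 56: 5403, 70: 5389, 24: 5383,
}

# Whatever is not covered by the listed values ends up in "Other values".
other_distinct = n_distinct - len(top10)
other_count = n_observations - sum(top10.values())
other_pct = 100 * other_count / n_observations

print(f"Other values ({other_distinct}): {other_count} ({other_pct:.1f}%)")
```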

Gender
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB
Female: 133145
Male: 132749
Other: 13962

Length

Max length: 6
Median length: 5
Mean length: 5.001415
Min length: 4

Characters and Unicode

Total characters: 1399676
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Male
2nd row: Male
3rd row: Other
4th row: Female
5th row: Male

Common Values

Value | Count | Frequency (%)
Female | 133145 | 47.6%
Male | 132749 | 47.4%
Other | 13962 | 5.0%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Words

Value | Count | Frequency (%)
female | 133145 | 47.6%
male | 132749 | 47.4%
other | 13962 | 5.0%

Most occurring characters

Value | Count | Frequency (%)
e | 413001 | 29.5%
a | 265894 | 19.0%
l | 265894 | 19.0%
F | 133145 | 9.5%
m | 133145 | 9.5%
M | 132749 | 9.5%
O | 13962 | 1.0%
t | 13962 | 1.0%
h | 13962 | 1.0%
r | 13962 | 1.0%
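
The character table for Gender can be reproduced from the category counts alone: each character of a label contributes that label's row count. A minimal sketch under that assumption (the helper `character_totals` is illustrative and is not the routine ydata-profiling itself uses):

```python
from collections import Counter

# Category counts copied from the Gender section of this report.
gender_counts = {"Female": 133145, "Male": 132749, "Other": 13962}


def character_totals(value_counts):
    """Weight every character of each label by the label's row count."""
    totals = Counter()
    for value, count in value_counts.items():
        for ch in value:
            totals[ch] += count
    return totals


totals = character_totals(gender_counts)
total_chars = sum(totals.values())
mean_length = total_chars / sum(gender_counts.values())

print(totals.most_common(3))
print(f"Total characters: {total_chars}")
print(f"Distinct characters: {len(totals)}")
print(f"Mean length: {mean_length:.6f}")
```

Counting is case-sensitive here, which is why `M` and `m` appear as separate rows in the table.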

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1399676 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1399676 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1399676 | 100.0%

Income
Real number (ℝ)

HIGH CORRELATION 

Distinct: 201
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 76499.164
Minimum: 9000
Maximum: 209000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 9000
5-th percentile: 21000
Q1: 42000
Median: 68000
Q3: 104000
95-th percentile: 160000
Maximum: 209000
Range: 200000
Interquartile range (IQR): 62000

Descriptive statistics

Standard deviation: 42875.575
Coefficient of variation (CV): 0.56047116
Kurtosis: -0.23579952
Mean: 76499.164
Median Absolute Deviation (MAD): 29000
Skewness: 0.70877886
Sum: 2.140875 × 10^10
Variance: 1.8383149 × 10^9
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
35000 | 3112 | 1.1%
37000 | 3097 | 1.1%
38000 | 3069 | 1.1%
53000 | 3005 | 1.1%
44000 | 2986 | 1.1%
45000 | 2977 | 1.1%
47000 | 2964 | 1.1%
42000 | 2954 | 1.1%
54000 | 2945 | 1.1%
52000 | 2937 | 1.0%
Other values (191) | 249810 | 89.3%

Minimum 10 values

Value | Count | Frequency (%)
9000 | 168 | 0.1%
10000 | 416 | 0.1%
11000 | 561 | 0.2%
12000 | 736 | 0.3%
13000 | 927 | 0.3%
14000 | 1029 | 0.4%
15000 | 1192 | 0.4%
16000 | 1381 | 0.5%
17000 | 1485 | 0.5%
18000 | 1477 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
209000 | 39 | < 0.1%
208000 | 30 | < 0.1%
207000 | 15 | < 0.1%
206000 | 59 | < 0.1%
205000 | 54 | < 0.1%
204000 | 68 | < 0.1%
203000 | 93 | < 0.1%
202000 | 75 | < 0.1%
201000 | 107 | < 0.1%
200000 | 119 | < 0.1%

Credit Score
Real number (ℝ)

HIGH CORRELATION 

Distinct: 551
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 582.95377
Minimum: 300
Maximum: 850
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 300
5-th percentile: 309
Q1: 446
Median: 584
Q3: 722
95-th percentile: 850
Maximum: 850
Range: 550
Interquartile range (IQR): 276

Descriptive statistics

Standard deviation: 163.07675
Coefficient of variation (CV): 0.27974217
Kurtosis: -1.1255961
Mean: 582.95377
Median Absolute Deviation (MAD): 138
Skewness: -0.038313528
Sum: 1.6314311 × 10^8
Variance: 26594.028
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
850 | 15625 | 5.6%
300 | 11363 | 4.1%
438 | 613 | 0.2%
401 | 607 | 0.2%
412 | 605 | 0.2%
537 | 602 | 0.2%
770 | 600 | 0.2%
756 | 597 | 0.2%
502 | 594 | 0.2%
390 | 590 | 0.2%
Other values (541) | 248060 | 88.6%

Minimum 10 values

Value | Count | Frequency (%)
300 | 11363 | 4.1%
301 | 315 | 0.1%
302 | 290 | 0.1%
303 | 289 | 0.1%
304 | 289 | 0.1%
305 | 295 | 0.1%
306 | 268 | 0.1%
307 | 315 | 0.1%
308 | 312 | 0.1%
309 | 305 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
850 | 15625 | 5.6%
849 | 258 | 0.1%
848 | 244 | 0.1%
847 | 260 | 0.1%
846 | 207 | 0.1%
845 | 221 | 0.1%
844 | 204 | 0.1%
843 | 216 | 0.1%
842 | 269 | 0.1%
841 | 217 | 0.1%

Credit History Length
Real number (ℝ)

Distinct: 606
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 307.96515
Minimum: 6
Maximum: 611
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 6
5-th percentile: 36
Q1: 156
Median: 307
Q3: 460
95-th percentile: 581
Maximum: 611
Range: 605
Interquartile range (IQR): 304

Descriptive statistics

Standard deviation: 175.08327
Coefficient of variation (CV): 0.5685165
Kurtosis: -1.2049608
Mean: 307.96515
Median Absolute Deviation (MAD): 152
Skewness: 0.009495387
Sum: 86185894
Variance: 30654.151
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
323 | 572 | 0.2%
127 | 568 | 0.2%
559 | 567 | 0.2%
192 | 554 | 0.2%
59 | 550 | 0.2%
353 | 549 | 0.2%
82 | 542 | 0.2%
27 | 541 | 0.2%
57 | 540 | 0.2%
368 | 540 | 0.2%
Other values (596) | 274333 | 98.0%

Minimum 10 values

Value | Count | Frequency (%)
6 | 468 | 0.2%
7 | 519 | 0.2%
8 | 461 | 0.2%
9 | 435 | 0.2%
10 | 459 | 0.2%
11 | 460 | 0.2%
12 | 440 | 0.2%
13 | 467 | 0.2%
14 | 422 | 0.2%
15 | 460 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
611 | 471 | 0.2%
610 | 490 | 0.2%
609 | 449 | 0.2%
608 | 430 | 0.2%
607 | 453 | 0.2%
606 | 420 | 0.2%
605 | 470 | 0.2%
604 | 523 | 0.2%
603 | 420 | 0.2%
602 | 450 | 0.2%

Number of Existing Loans
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.701693
Minimum: 0
Maximum: 10
Zeros: 26590
Zeros (%): 9.5%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 5
Q3: 7
95-th percentile: 10
Maximum: 10
Range: 10
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.9803508
Coefficient of variation (CV): 0.63388885
Kurtosis: -1.1081317
Mean: 4.701693
Median Absolute Deviation (MAD): 3
Skewness: 0.053826442
Sum: 1315797
Variance: 8.8824907
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=11)]

Common values

Value | Count | Frequency (%)
8 | 28184 | 10.1%
4 | 28059 | 10.0%
2 | 27968 | 10.0%
6 | 27955 | 10.0%
5 | 27950 | 10.0%
7 | 27792 | 9.9%
3 | 27745 | 9.9%
0 | 26590 | 9.5%
1 | 24656 | 8.8%
9 | 17332 | 6.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 26590 | 9.5%
1 | 24656 | 8.8%
2 | 27968 | 10.0%
3 | 27745 | 9.9%
4 | 28059 | 10.0%
5 | 27950 | 10.0%
6 | 27955 | 10.0%
7 | 27792 | 9.9%
8 | 28184 | 10.1%
9 | 17332 | 6.2%

Maximum 10 values

Value | Count | Frequency (%)
10 | 15625 | 5.6%
9 | 17332 | 6.2%
8 | 28184 | 10.1%
7 | 27792 | 9.9%
6 | 27955 | 10.0%
5 | 27950 | 10.0%
4 | 28059 | 10.0%
3 | 27745 | 9.9%
2 | 27968 | 10.0%
1 | 24656 | 8.8%

Loan Amount
Real number (ℝ)

Distinct: 55681
Distinct (%): 19.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 105795.34
Minimum: 5294
Maximum: 150000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 5294
5-th percentile: 33959
Q1: 72173
Median: 111263
Q3: 150000
95-th percentile: 150000
Maximum: 150000
Range: 144706
Interquartile range (IQR): 77827

Descriptive statistics

Standard deviation: 40458.371
Coefficient of variation (CV): 0.3824211
Kurtosis: -1.0630462
Mean: 105795.34
Median Absolute Deviation (MAD): 38737
Skewness: -0.43901936
Sum: 2.9607461 × 10^10
Variance: 1.6368798 × 10^9
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
150000 | 76042 | 27.2%
51258 | 20 | < 0.1%
127479 | 15 | < 0.1%
56174 | 15 | < 0.1%
138647 | 15 | < 0.1%
54395 | 15 | < 0.1%
104552 | 15 | < 0.1%
91361 | 15 | < 0.1%
137171 | 15 | < 0.1%
137704 | 15 | < 0.1%
Other values (55671) | 203674 | 72.8%

Minimum 10 values

Value | Count | Frequency (%)
5294 | 3 | < 0.1%
5319 | 1 | < 0.1%
5514 | 3 | < 0.1%
5684 | 2 | < 0.1%
5985 | 3 | < 0.1%
6221 | 3 | < 0.1%
6267 | 3 | < 0.1%
6501 | 2 | < 0.1%
6549 | 3 | < 0.1%
6568 | 3 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
150000 | 76042 | 27.2%
149999 | 3 | < 0.1%
149998 | 5 | < 0.1%
149995 | 3 | < 0.1%
149993 | 2 | < 0.1%
149992 | 5 | < 0.1%
149988 | 6 | < 0.1%
149986 | 4 | < 0.1%
149985 | 6 | < 0.1%
149982 | 3 | < 0.1%

Loan Tenure
Real number (ℝ)

HIGH CORRELATION 

Distinct: 348
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 133.34065
Minimum: 12
Maximum: 359
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 12
5-th percentile: 22
Q1: 62
Median: 100
Q3: 201
95-th percentile: 328
Maximum: 359
Range: 347
Interquartile range (IQR): 139

Descriptive statistics

Standard deviation: 96.064132
Coefficient of variation (CV): 0.72044144
Kurtosis: -0.49576527
Mean: 133.34065
Median Absolute Deviation (MAD): 53
Skewness: 0.84222386
Sum: 37316182
Variance: 9228.3176
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
100 | 1971 | 0.7%
96 | 1952 | 0.7%
81 | 1930 | 0.7%
63 | 1921 | 0.7%
73 | 1906 | 0.7%
84 | 1886 | 0.7%
94 | 1881 | 0.7%
91 | 1879 | 0.7%
87 | 1878 | 0.7%
114 | 1878 | 0.7%
Other values (338) | 260774 | 93.2%

Minimum 10 values

Value | Count | Frequency (%)
12 | 1476 | 0.5%
13 | 1443 | 0.5%
14 | 1326 | 0.5%
15 | 1306 | 0.5%
16 | 1351 | 0.5%
17 | 1334 | 0.5%
18 | 1258 | 0.4%
19 | 1326 | 0.5%
20 | 1387 | 0.5%
21 | 1446 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
359 | 458 | 0.2%
358 | 474 | 0.2%
357 | 441 | 0.2%
356 | 380 | 0.1%
355 | 466 | 0.2%
354 | 507 | 0.2%
353 | 442 | 0.2%
352 | 434 | 0.2%
351 | 440 | 0.2%
350 | 442 | 0.2%

Existing Customer
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 273.4 KiB

Value | Count | Frequency (%)
False | 173952 | 62.2%
True | 105904 | 37.8%

State
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB
Karnataka: 28245
Telangana: 28101
Maharashtra: 28095
Gujarat: 28051
West Bengal: 28050
Other values (5): 139314

Length

Max length: 13
Median length: 11
Mean length: 8.9965875
Min length: 5

Characters and Unicode

Total characters: 2517749
Distinct characters: 27
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Karnataka
2nd row: Karnataka
3rd row: Uttar Pradesh
4th row: Karnataka
5th row: Karnataka

Common Values

Value | Count | Frequency (%)
Karnataka | 28245 | 10.1%
Telangana | 28101 | 10.0%
Maharashtra | 28095 | 10.0%
Gujarat | 28051 | 10.0%
West Bengal | 28050 | 10.0%
Tamil Nadu | 28022 | 10.0%
Kerala | 28011 | 10.0%
Delhi | 27996 | 10.0%
Uttar Pradesh | 27713 | 9.9%
Rajasthan | 27572 | 9.9%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Words

Value | Count | Frequency (%)
karnataka | 28245 | 7.8%
telangana | 28101 | 7.7%
maharashtra | 28095 | 7.7%
gujarat | 28051 | 7.7%
west | 28050 | 7.7%
bengal | 28050 | 7.7%
tamil | 28022 | 7.7%
nadu | 28022 | 7.7%
kerala | 28011 | 7.7%
delhi | 27996 | 7.7%
Other values (3) | 82998 | 22.8%
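
The lowercase word table uses a different denominator from the value table: two-word states such as "West Bengal" contribute two words, so the percentages are taken over all words rather than over rows. A minimal sketch of that calculation, assuming words are obtained by lowercasing and splitting on whitespace (an assumption about the tokenisation, not a statement of how ydata-profiling implements it):

```python
from collections import Counter

# State counts copied from the Common Values table of this report.
state_counts = {
    "Karnataka": 28245, "Telangana": 28101, "Maharashtra": 28095,
    "Gujarat": 28051, "West Bengal": 28050, "Tamil Nadu": 28022,
    "Kerala": 28011, "Delhi": 27996, "Uttar Pradesh": 27713,
    "Rajasthan": 27572,
}

# Each row contributes one count per word in its (lowercased) label.
word_totals = Counter()
for state, count in state_counts.items():
    for word in state.lower().split():
        word_totals[word] += count

n_rows = sum(state_counts.values())
n_words = sum(word_totals.values())
karnataka_pct = 100 * word_totals["karnataka"] / n_words

# The ten most frequent words are listed; the rest fall into "Other values".
top10 = word_totals.most_common(10)
other_count = n_words - sum(count for _, count in top10)

print(f"Rows: {n_rows}, words: {n_words}")
print(f"karnataka: {karnataka_pct:.1f}% of words")
print(f"Other values ({len(word_totals) - 10}): {other_count}")
```

The extra 83785 words equal the number of rows with a two-word state name.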

Most occurring characters

Value | Count | Frequency (%)
a | 644023 | 25.6%
r | 195923 | 7.8%
t | 195439 | 7.8%
e | 167921 | 6.7%
l | 140180 | 5.6%
n | 140069 | 5.6%
h | 139471 | 5.5%
s | 111430 | 4.4%
(space) | 83785 | 3.3%
K | 56256 | 2.2%
Other values (17) | 643252 | 25.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2517749 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2517749 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2517749 | 100.0%

City
Categorical

HIGH CORRELATION 

Distinct: 23
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB
Kolkata: 23900
New Delhi: 23887
Hyderabad: 23726
Mysuru: 12227
Udaipur: 12012
Other values (18): 184104

Length

Max length: 18
Median length: 15
Mean length: 8.1744576
Min length: 4

Characters and Unicode

Total characters: 2287671
Distinct characters: 37
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Mysuru
2nd row: Bengaluru
3rd row: Kanpur
4th row: Bengaluru
5th row: Mysuru

Common Values

Value | Count | Frequency (%)
Kolkata | 23900 | 8.5%
New Delhi | 23887 | 8.5%
Hyderabad | 23726 | 8.5%
Mysuru | 12227 | 4.4%
Udaipur | 12012 | 4.3%
Kanpur | 12001 | 4.3%
Surat | 11969 | 4.3%
Thiruvananthapuram | 11966 | 4.3%
Coimbatore | 11907 | 4.3%
Ahmedabad | 11896 | 4.3%
Other values (13) | 124365 | 44.4%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
kolkata | 23900 | 7.9%
new | 23887 | 7.9%
delhi | 23887 | 7.9%
hyderabad | 23726 | 7.8%
mysuru | 12227 | 4.0%
udaipur | 12012 | 4.0%
kanpur | 12001 | 4.0%
surat | 11969 | 3.9%
thiruvananthapuram | 11966 | 3.9%
coimbatore | 11907 | 3.9%
Other values (14) | 136261 | 44.9%

Most occurring characters

Value | Count | Frequency (%)
a | 365947 | 16.0%
u | 180240 | 7.9%
r | 172604 | 7.5%
i | 136639 | 6.0%
e | 135092 | 5.9%
n | 132465 | 5.8%
h | 108425 | 4.7%
p | 88704 | 3.9%
l | 84773 | 3.7%
d | 83256 | 3.6%
Other values (27) | 799526 | 34.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2287671 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2287671 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2287671 | 100.0%

LTV Ratio
Real number (ℝ)

HIGH CORRELATION 

Distinct: 80874
Distinct (%): 28.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 71.643101
Minimum: 40
Maximum: 95
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 40
5-th percentile: 40
Q1: 58.105848
Median: 72.133017
Q3: 86.239725
95-th percentile: 95
Maximum: 95
Range: 55
Interquartile range (IQR): 28.133877

Descriptive statistics

Standard deviation: 16.865785
Coefficient of variation (CV): 0.23541394
Kurtosis: -1.0662292
Mean: 71.643101
Median Absolute Deviation (MAD): 14.065639
Skewness: -0.18082603
Sum: 20049752
Variance: 284.45469
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
95 | 38991 | 13.9%
40 | 14388 | 5.1%
90.94342996 | 3 | < 0.1%
92.20152802 | 3 | < 0.1%
69.42563938 | 3 | < 0.1%
42.50439875 | 3 | < 0.1%
88.03605942 | 3 | < 0.1%
77.64399342 | 3 | < 0.1%
58.86225002 | 3 | < 0.1%
92.53498303 | 3 | < 0.1%
Other values (80864) | 226453 | 80.9%

Minimum 10 values

Value | Count | Frequency (%)
40 | 14388 | 5.1%
40.00789025 | 3 | < 0.1%
40.00865185 | 3 | < 0.1%
40.00893284 | 3 | < 0.1%
40.00915752 | 3 | < 0.1%
40.00972416 | 3 | < 0.1%
40.01411194 | 3 | < 0.1%
40.01493892 | 3 | < 0.1%
40.01588615 | 3 | < 0.1%
40.01621929 | 3 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
95 | 38991 | 13.9%
94.999142 | 3 | < 0.1%
94.99794113 | 3 | < 0.1%
94.99680815 | 2 | < 0.1%
94.99562152 | 3 | < 0.1%
94.99476534 | 3 | < 0.1%
94.99452662 | 3 | < 0.1%
94.99372987 | 3 | < 0.1%
94.99132802 | 3 | < 0.1%
94.99105575 | 1 | < 0.1%

Employment Profile
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB
Salaried: 136065
Self-Employed: 84369
Freelancer: 22629
Student: 18521
Unemployed: 18272

Length

Max length: 13
Median length: 10
Mean length: 9.7334844
Min length: 7

Characters and Unicode

Total characters: 2723974
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Salaried
2nd row: Salaried
3rd row: Salaried
4th row: Self-Employed
5th row: Salaried

Common Values

Value | Count | Frequency (%)
Salaried | 136065 | 48.6%
Self-Employed | 84369 | 30.1%
Freelancer | 22629 | 8.1%
Student | 18521 | 6.6%
Unemployed | 18272 | 6.5%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Words

Value | Count | Frequency (%)
salaried | 136065 | 48.6%
self-employed | 84369 | 30.1%
freelancer | 22629 | 8.1%
student | 18521 | 6.6%
unemployed | 18272 | 6.5%

Most occurring characters

Value | Count | Frequency (%)
e | 427755 | 15.7%
l | 345704 | 12.7%
a | 294759 | 10.8%
d | 257227 | 9.4%
S | 238955 | 8.8%
r | 181323 | 6.7%
i | 136065 | 5.0%
y | 102641 | 3.8%
o | 102641 | 3.8%
p | 102641 | 3.8%
Other values (10) | 534263 | 19.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2723974 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2723974 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2723974 | 100.0%

Profile Score
Real number (ℝ)

HIGH CORRELATION 

Distinct: 101
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77.350155
Minimum: 0
Maximum: 100
Zeros: 858
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 28
Q1: 61
Median: 89
Q3: 98
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 37

Descriptive statistics

Standard deviation: 24.509196
Coefficient of variation (CV): 0.31686033
Kurtosis: 0.041989854
Mean: 77.350155
Median Absolute Deviation (MAD): 11
Skewness: -1.0145518
Sum: 21646905
Variance: 600.70067
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
100 | 58495 | 20.9%
90 | 8362 | 3.0%
98 | 8284 | 3.0%
95 | 8257 | 3.0%
94 | 8176 | 2.9%
92 | 8173 | 2.9%
96 | 8012 | 2.9%
91 | 7961 | 2.8%
97 | 7909 | 2.8%
99 | 7852 | 2.8%
Other values (91) | 148375 | 53.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 858 | 0.3%
1 | 119 | < 0.1%
2 | 137 | < 0.1%
3 | 194 | 0.1%
4 | 189 | 0.1%
5 | 173 | 0.1%
6 | 253 | 0.1%
7 | 273 | 0.1%
8 | 259 | 0.1%
9 | 285 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
100 | 58495 | 20.9%
99 | 7852 | 2.8%
98 | 8284 | 3.0%
97 | 7909 | 2.8%
96 | 8012 | 2.9%
95 | 8257 | 3.0%
94 | 8176 | 2.9%
93 | 7751 | 2.8%
92 | 8173 | 2.9%
91 | 7961 | 2.8%

Occupation
Categorical

HIGH CORRELATION  MISSING 

Distinct: 14
Distinct (%): < 0.1%
Missing: 18272
Missing (%): 6.5%
Memory size: 2.1 MiB
Banker: 27760
Teacher: 27356
Civil Servant: 27221
Software Engineer: 27146
Doctor: 26582
Other values (9): 125519

Length

Max length: 22
Median length: 16
Mean length: 10.028488
Min length: 6

Characters and Unicode

Total characters: 2623292
Distinct characters: 33
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Doctor
2nd row: Software Engineer
3rd row: Banker
4th row: Contractor
5th row: Teacher

Common Values

Value | Count | Frequency (%)
Banker | 27760 | 9.9%
Teacher | 27356 | 9.8%
Civil Servant | 27221 | 9.7%
Software Engineer | 27146 | 9.7%
Doctor | 26582 | 9.5%
Shopkeeper | 21405 | 7.6%
Contractor | 21090 | 7.5%
Farmer | 20966 | 7.5%
Business Owner | 20908 | 7.5%
Student | 18521 | 6.6%
Other values (4) | 22629 | 8.1%
(Missing) | 18272 | 6.5%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
banker | 27760 | 8.0%
teacher | 27356 | 7.9%
civil | 27221 | 7.8%
servant | 27221 | 7.8%
software | 27146 | 7.8%
engineer | 27146 | 7.8%
doctor | 26582 | 7.6%
shopkeeper | 21405 | 6.1%
contractor | 21090 | 6.1%
farmer | 20966 | 6.0%
Other values (9) | 94317 | 27.1%

Most occurring characters

Value | Count | Frequency (%)
e | 376257 | 14.3%
r | 323638 | 12.3%
n | 224563 | 8.6%
t | 188333 | 7.2%
a | 168596 | 6.4%
o | 160935 | 6.1%
i | 119514 | 4.6%
S | 94293 | 3.6%
(space) | 86626 | 3.3%
c | 80751 | 3.1%
Other values (23) | 799786 | 30.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2623292 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2623292 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2623292 | 100.0%

Interactions

[Figure: 81 interaction plots]

Correlations

Variable | Age | City | Credit History Length | Credit Score | Employment Profile | Existing Customer | Gender | Income | LTV Ratio | Loan Amount | Loan Tenure | Number of Existing Loans | Occupation | Profile Score | State
Age | 1.000 | 0.013 | 0.002 | 0.125 | 0.049 | 0.083 | 0.000 | 0.613 | -0.040 | 0.262 | 0.063 | 0.123 | 0.030 | 0.100 | 0.008
City | 0.013 | 1.000 | 0.011 | 0.012 | 0.012 | 0.016 | 0.005 | 0.015 | 0.013 | 0.013 | 0.012 | 0.013 | 0.013 | 0.013 | 0.922
Credit History Length | 0.002 | 0.011 | 1.000 | 0.002 | 0.009 | 0.013 | 0.000 | 0.003 | 0.001 | 0.001 | -0.003 | 0.002 | 0.010 | 0.000 | 0.007
Credit Score | 0.125 | 0.012 | 0.002 | 1.000 | 0.056 | 0.949 | 0.004 | 0.208 | -0.363 | 0.083 | 0.668 | 0.996 | 0.034 | 0.786 | 0.007
Employment Profile | 0.049 | 0.012 | 0.009 | 0.056 | 1.000 | 0.058 | 0.004 | 0.089 | 0.019 | 0.030 | 0.026 | 0.053 | 1.000 | 0.122 | 0.007
Existing Customer | 0.083 | 0.016 | 0.013 | 0.949 | 0.058 | 1.000 | 0.001 | 0.149 | 0.466 | 0.057 | 0.681 | 0.948 | 0.049 | 0.675 | 0.006
Gender | 0.000 | 0.005 | 0.000 | 0.004 | 0.004 | 0.001 | 1.000 | 0.000 | 0.003 | 0.005 | 0.000 | 0.005 | 0.002 | 0.005 | 0.000
Income | 0.613 | 0.015 | 0.003 | 0.208 | 0.089 | 0.149 | 0.000 | 1.000 | -0.067 | 0.425 | 0.107 | 0.206 | 0.052 | 0.166 | 0.007
LTV Ratio | -0.040 | 0.013 | 0.001 | -0.363 | 0.019 | 0.466 | 0.003 | -0.067 | 1.000 | -0.030 | -0.239 | -0.361 | 0.013 | -0.553 | 0.009
Loan Amount | 0.262 | 0.013 | 0.001 | 0.083 | 0.030 | 0.057 | 0.005 | 0.425 | -0.030 | 1.000 | 0.046 | 0.082 | 0.019 | 0.066 | 0.008
Loan Tenure | 0.063 | 0.012 | -0.003 | 0.668 | 0.026 | 0.681 | 0.000 | 0.107 | -0.239 | 0.046 | 1.000 | 0.664 | 0.017 | 0.538 | 0.007
Number of Existing Loans | 0.123 | 0.013 | 0.002 | 0.996 | 0.053 | 0.948 | 0.005 | 0.206 | -0.361 | 0.082 | 0.664 | 1.000 | 0.032 | 0.782 | 0.009
Occupation | 0.030 | 0.013 | 0.010 | 0.034 | 1.000 | 0.049 | 0.002 | 0.052 | 0.013 | 0.019 | 0.017 | 0.032 | 1.000 | 0.073 | 0.010
Profile Score | 0.100 | 0.013 | 0.000 | 0.786 | 0.122 | 0.675 | 0.005 | 0.166 | -0.553 | 0.066 | 0.538 | 0.782 | 0.073 | 1.000 | 0.008
State | 0.008 | 0.922 | 0.007 | 0.007 | 0.007 | 0.006 | 0.000 | 0.007 | 0.009 | 0.008 | 0.007 | 0.009 | 0.010 | 0.008 | 1.000
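
The pairs that stand out in the correlation table can be picked out with a simple threshold on the absolute coefficient. The sketch below is a plain-Python illustration, not the profiler's own code: the `CORR` dictionary holds a few coefficients copied from the table, and the cutoff of 0.5 is an assumption chosen because it is consistent with the High correlation alerts in this report. The name `high_correlation_pairs` is hypothetical.

```python
# A few off-diagonal coefficients copied from the correlation table (subset only).
CORR = {
    ("Age", "Income"): 0.613,
    ("Age", "Loan Amount"): 0.262,
    ("City", "State"): 0.922,
    ("Credit Score", "Number of Existing Loans"): 0.996,
    ("Credit Score", "LTV Ratio"): -0.363,
    ("LTV Ratio", "Profile Score"): -0.553,
    ("Income", "Loan Amount"): 0.425,
    ("Employment Profile", "Occupation"): 1.000,
}

# Assumed cutoff; it matches the alerts shown in this report but is not read from its config.
THRESHOLD = 0.5


def high_correlation_pairs(corr, threshold=THRESHOLD):
    """Return (a, b, value) tuples whose absolute coefficient meets the threshold,
    strongest first."""
    flagged = [(a, b, v) for (a, b), v in corr.items() if abs(v) >= threshold]
    return sorted(flagged, key=lambda item: abs(item[2]), reverse=True)


for a, b, value in high_correlation_pairs(CORR):
    print(f"{a} ~ {b}: {value:+.3f}")
```

Taking the absolute value matters here: LTV Ratio and Profile Score are flagged even though their coefficient is negative.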

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
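
The per-column nullity that these plots summarize amounts to counting missing entries in each column. A minimal sketch in plain Python follows; the four records are illustrative, limited to two columns, and modelled on the sample rows in which an Unemployed applicant has no Occupation. The helper name `nullity_by_column` is hypothetical, and missing values are represented as `None` by assumption.

```python
def nullity_by_column(rows):
    """Return {column: (missing_count, missing_percent)} for a list of dict records.

    A value counts as missing when it is None (an assumption of this sketch).
    """
    if not rows:
        return {}
    total = len(rows)
    summary = {}
    for column in rows[0].keys():
        missing = sum(1 for record in rows if record.get(column) is None)
        summary[column] = (missing, 100.0 * missing / total)
    return summary


# Illustrative records only; not the full dataset.
rows = [
    {"Employment Profile": "Salaried", "Occupation": "Doctor"},
    {"Employment Profile": "Unemployed", "Occupation": None},
    {"Employment Profile": "Freelancer", "Occupation": "Writer"},
    {"Employment Profile": "Unemployed", "Occupation": None},
]

for column, (count, percent) in nullity_by_column(rows).items():
    print(f"{column}: {count} missing ({percent:.1f}%)")
```

Applied to the figures in this report, the same ratio gives 18272 / 279856, which rounds to the 6.5% reported for Occupation.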

Sample

AgeGenderIncomeCredit ScoreCredit History LengthNumber of Existing LoansLoan AmountLoan TenureExisting CustomerStateCityLTV RatioEmployment ProfileProfile ScoreOccupation
031Male360006044875109373221NoKarnatakaMysuru90.943430Salaried77Doctor
125Male50000447386215000089NoKarnatakaBengaluru91.135253Salaried43Software Engineer
262Other1780008505031069099110YesUttar PradeshKanpur40.000000Salaried90Banker
369Female460006683496150000148YesKarnatakaBengaluru87.393365Self-Employed86Contractor
452Male1320006015535150000157NoKarnatakaMysuru66.158757Salaried90Teacher
564Female12700085015810108702111YesTamil NaduCoimbatore82.331250Self-Employed92Contractor
629Male1500037889126819108NoUttar PradeshLucknow95.000000Self-Employed25Farmer
730Other82000424610212655092NoWest BengalKolkata93.634577Salaried58Banker
852Male1190007532718150000251YesRajasthanJaipur75.644166Freelancer100Writer
939Male101000575424511325712NoMaharashtraNagpur68.720556Salaried87Banker
AgeGenderIncomeCredit ScoreCredit History LengthNumber of Existing LoansLoan AmountLoan TenureExisting CustomerStateCityLTV RatioEmployment ProfileProfile ScoreOccupation
27984622Female290003281850116226114NoGujaratAhmedabad82.980319Freelancer34Photographer
27984770Male158000621277594575329NoDelhiNew Delhi68.763160Salaried95Software Engineer
27984840Female48000555365412924782NoUttar PradeshManjari95.000000Freelancer60Independent Consultant
27984950Male72000300465015000059NoRajasthanJaipur58.839934Salaried67Banker
27985053Female77000467268311518676NoKeralaThiruvananthapuram60.563183Salaried71Doctor
27985169Male61000495565315000014NoTamil NaduChennai90.300189Salaried71Software Engineer
27985245Female124000850476108034186YesKarnatakaBengaluru78.960607Salaried91Civil Servant
27985333Female71000582560539851101NoKarnatakaMysuru95.000000Unemployed57NaN
27985467Male1910004114812150000111NoDelhiNew Delhi56.109002Salaried69Software Engineer
27985533Other1800047437435152275NoUttar PradeshKanpur95.000000Salaried59Software Engineer

Duplicate rows

Most frequently occurring

AgeGenderIncomeCredit ScoreCredit History LengthNumber of Existing LoansLoan AmountLoan TenureExisting CustomerStateCityLTV RatioEmployment ProfileProfile ScoreOccupation# duplicates
318Female900037521312908257NoTamil NaduChennai94.439057Salaried31Software Engineer3
618Female900048451341704103NoUttar PradeshLucknow68.253740Salaried100Doctor3
918Female900054111944472952NoKeralaThiruvananthapuram82.813109Unemployed44NaN3
1118Female900058552951597085NoWest BengalDhulagori95.000000Self-Employed72Contractor3
3118Female1000051816731675834NoMaharashtraPune69.059854Self-Employed65Farmer3
4518Female11000300293045410102NoMaharashtraNellikuppam94.786607Unemployed24NaN3
5818Female1100075587815477250YesKeralaKochi40.000000Salaried100Doctor3
6418Female1200040757812176514NoMaharashtraNagpur59.530918Self-Employed51Contractor3
6818Female1200047715332709732NoUttar PradeshNellikuppam85.558770Salaried51Software Engineer3
7818Female1300030021104249169NoTamil NaduCoimbatore95.000000Salaried37Civil Servant3
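
A listing of this kind can be approximated by grouping identical rows and keeping the groups that occur more than once. The sketch below uses only the standard library; the short tuples are illustrative stand-ins for full 15-column records, and `most_frequent_duplicates` is a hypothetical helper. How the profiler itself arrives at its duplicate counts is not shown in this report, so this is an approximation of the idea rather than its implementation.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Group identical rows and return the (row, count) pairs seen more than once,
    most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    duplicated = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(duplicated, key=lambda item: item[1], reverse=True)[:top]


# Illustrative rows (Age, Gender, State, City); real records have 15 columns.
rows = [
    (18, "Female", "Tamil Nadu", "Chennai"),
    (18, "Female", "Kerala", "Kochi"),
    (18, "Female", "Tamil Nadu", "Chennai"),
    (31, "Male", "Karnataka", "Mysuru"),
    (18, "Female", "Kerala", "Kochi"),
    (18, "Female", "Tamil Nadu", "Chennai"),
]

for row, n in most_frequent_duplicates(rows):
    print(row, n)
```

Rows are converted to tuples so they can be hashed; any row containing unhashable values would need a different key.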